Abstract: The informational odds ratio (IOR) measures the post-exposure odds divided by 
the pre-exposure odds (i.e., information gained after knowing exposure status). A desirable 
property of an adjusted ratio estimate is collapsibility, wherein the combined crude ratio 
will not change after adjusting for a variable that is not a confounder. Adjusted traditional 
odds ratios (TORs) are not collapsible. In contrast, Mantel-Haenszel adjusted IORs, 
analogous to relative risks (RRs), generally are collapsible. IORs are a useful measure of 
disease association in case-referent studies, especially when the disease is common in the 
exposed and/or unexposed groups. This paper outlines how to compute power and sample 
size in the simple case of unadjusted IORs. 
1. Introduction 

A useful measure of association is the informational odds ratio (IOR) [1]. In the unadjusted case, 
the IOR is computed as (a/b)/(g/h), where a = number of exposed diseased individuals, b = number of 
exposed non-diseased individuals, g = number of diseased individuals, and h = number of non-diseased 
individuals (Table 1, Equation (1)). The IOR is equivalent to the post-exposure odds divided by the 
pre-exposure odds and is interpreted as an outcome measure of information gained after knowing 
exposure status (Equation (2)). The measure resembles the traditional odds ratio (TOR) (i.e., TOR = 
(a/b)/(c/d), where c = number of non-exposed diseased individuals, d = number of non-exposed 
non-diseased individuals), except that the probability terms in the denominator are not conditioned on 
the absence of exposure. 

A key advantage of IORs is that the Mantel-Haenszel adjusted ratio estimates are collapsible (i.e., the 
combined crude ratio will not change after adjusting for a variable that is not a confounder) [1]. In contrast, 
adjusted TORs are not collapsible. IORs also are a useful and meaningful measure of association in 
case-referent studies of common diseases (e.g., obesity, diabetes) because their practical interpretation 
does not depend on estimating relative risk (RR) (i.e., the rare disease assumption is not required). 

Prior to conducting a study it is important to determine how large a sample is needed to be 
reasonably confident that estimates are precise and suitable for answering a priori hypotheses. 
Alternatively, one may specify the sample size and then compute the study power required to reject the 
null hypothesis given that it is false. This paper presents simple formulas for computing power and 
sample size for IOR. 



Table 1. A 2 x 2 contingency table. 

                Disease
Exposure        D             D̄             Total
E               a = 2,352     b = 1,600     e = 3,952
Ē               c = 912       d = 1,600     f = 2,512
Total           g = 3,264     h = 3,200     i = 6,464



The IOR is computed from the above 2x2 contingency table as: 

$$\mathrm{IOR} = \frac{a/b}{g/h} = \frac{2{,}352/1{,}600}{3{,}264/3{,}200} = 1.44 \quad (95\%\ \mathrm{CI} = 1.38 - 1.50) \qquad (1)$$

where the 95%CI is based on the robust variance estimate for log (IOR) [1]. The equivalence between 
IOR and the post-exposure odds divided by the pre-exposure odds is shown below: 

$$\mathrm{IOR} = \frac{a/b}{g/h} = \frac{a/g}{b/h} = \frac{P(E \mid D)}{P(E \mid \bar{D})} = \frac{P(D \mid E)\,P(E)/P(D)}{P(\bar{D} \mid E)\,P(E)/P(\bar{D})} = \frac{P(D \mid E)/P(\bar{D} \mid E)}{P(D)/P(\bar{D})} = \frac{\text{Post-Exposure Odds}}{\text{Pre-Exposure Odds}} \qquad (2)$$
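As a quick numerical check of Equations (1) and (2), the following minimal Python sketch recomputes the IOR from the Table 1 cell counts and confirms that it equals the post-exposure odds divided by the pre-exposure odds. The variable names are illustrative only and are not taken from the original paper; the confidence interval is not reproduced because the robust variance formula from [1] is not given here.

```python
# Cell counts from Table 1 (rows: exposure status, columns: disease status).
a, b = 2352, 1600    # exposed: diseased, non-diseased
c, d = 912, 1600     # non-exposed: diseased, non-diseased
g, h = a + c, b + d  # column totals: all diseased, all non-diseased

# Equation (1): IOR = (a/b) / (g/h).
ior = (a / b) / (g / h)

# Equation (2): odds of disease among the exposed vs. odds of disease overall.
post_exposure_odds = (a / (a + b)) / (b / (a + b))  # P(D|E) / P(not D|E)
pre_exposure_odds = (g / (g + h)) / (h / (g + h))   # P(D) / P(not D)
ior_from_odds = post_exposure_odds / pre_exposure_odds

# Traditional odds ratio, for comparison: TOR = (a/b) / (c/d).
tor = (a / b) / (c / d)

print(f"IOR = {ior:.2f}, post/pre odds = {ior_from_odds:.2f}, TOR = {tor:.2f}")
# prints: IOR = 1.44, post/pre odds = 1.44, TOR = 2.58
```

Because disease is common in this table (g/i is about 0.50), the TOR is considerably larger than the IOR.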

2. Methods 



There exist two types of error in classical statistical hypothesis testing [2]. In the current context, a 
rejection error (also known as a "Type I" or "α" error) occurs when the null hypothesis (H0: IOR = 1) 
versus the alternative hypothesis (HA: IOR ≠ 1) is falsely rejected, i.e., α = P(reject H0 | H0 is true). 
An acceptance error (also known as a "Type II" or "β" error) occurs when the null hypothesis is falsely 
accepted, i.e., β = P(do not reject H0 | H0 is false). The "power" of the test of hypothesis is defined as 
(1 - β) and denotes the probability of correctly rejecting the null hypothesis, i.e., P(reject H0 | H0 is 
false). The power also conveys the likelihood that a particular research design will detect a deviation 
from the null hypothesis given that one exists. Another important concept in experimental design and 
hypothesis testing is the "sample size" of a test. Sample size denotes the number (n) of experimental 
units (e.g., people, animals, widgets) needed to achieve a specified power at the α-level of statistical 
significance. Several factors influence the sample size, including Type I and Type II error and the 
underlying variability of the sampling distribution. 

Power and sample size for IORs may be computed by a simple rearrangement of the general 
formulas for marginal risk ratios [3]. Letting p1 = proportion of diseased individuals who are exposed, 
p0 = proportion of non-diseased individuals who are exposed, r = ratio of non-diseased to diseased 
individuals, Zα/2 = 100(1 - α/2) centile of the standard normal distribution, and Zβ = the standard normal 
deviate corresponding to β = (1 - power), it follows that 

$$Z_{\beta} = \left[\frac{n\,(p_1 - p_0)^2\,r}{(r + 1)\,\bar{p}\,(1 - \bar{p})}\right]^{1/2} - Z_{\alpha/2}$$

and 

$$n = \frac{(Z_{\alpha/2} + Z_{\beta})^2\,\bar{p}\,(1 - \bar{p})\,(r + 1)}{(p_1 - p_0)^2\,r},$$

where p̄ = (p1 + r·p0)/(1 + r), and p1 = p0·IOR. 



Power then equals the probability that an observation from the standard normal distribution is less than 
or equal to Zβ. The above formulas assume a log-normal distribution for IOR and the use of a robust 
variance estimate for the logarithm of IOR based on the delta method [1,3]. Many commonly available 
statistical packages provide routines for computing power and sample size for hypothesis tests 
involving unadjusted RRs. These routines may be adapted to compute power and sample size for IORs 
by transposing the input data matrix. Results from the examples below may be used to confirm that the 
input matrix was properly transposed. Slight differences in the results may be due to variations in the 
underlying sampling distribution, algorithms and/or numerical methods used by a particular statistical 
package. Under general admissibility conditions, computations will converge in distribution and yield 
asymptotically equivalent results [4]. 
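The power and sample size formulas in this section can be sketched in Python using only the standard library. This is an illustrative reading of the formulas, not the author's software: the function names `ior_power` and `ior_sample_size` are hypothetical, n is taken to be the number of diseased individuals (with r·n non-diseased), the pooled exposure proportion is taken as (p1 + r·p0)/(1 + r), and p1 is obtained as p0·IOR.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def _exposure_proportions(p0, ior, r):
    """Return (p1, pooled proportion) under the assumption p1 = p0 * IOR."""
    p1 = p0 * ior
    if not (0.0 < p0 < 1.0 and 0.0 < p1 < 1.0):
        raise ValueError("p0 and p0 * IOR must both lie strictly between 0 and 1")
    return p1, (p1 + r * p0) / (1.0 + r)


def ior_power(n, p0, ior, r=1.0, alpha=0.05):
    """Approximate two-sided power for n diseased and r*n non-diseased individuals."""
    p1, p_bar = _exposure_proportions(p0, ior, r)
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = sqrt(n * (p1 - p0) ** 2 * r / ((r + 1.0) * p_bar * (1.0 - p_bar))) - z_alpha
    return _STD_NORMAL.cdf(z_beta)


def ior_sample_size(p0, ior, power=0.80, r=1.0, alpha=0.05):
    """Approximate (unrounded) number of diseased individuals for the target power."""
    p1, p_bar = _exposure_proportions(p0, ior, r)
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_beta) ** 2 * p_bar * (1.0 - p_bar) * (r + 1.0) / ((p1 - p0) ** 2 * r)


if __name__ == "__main__":
    print(f"power = {ior_power(100, 0.04, 4.0):.3f}")
    print(f"n (diseased) = {ior_sample_size(0.10, 2.0):.1f}")
```

The sample size is returned unrounded so that the caller can choose a rounding convention; in practice one would usually round up to the next whole individual.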



Assuming an equal number of diseased (n1 = 100) and non-diseased (n0 = 100) individuals, the plot in 
Figure 1 gives power for IOR = (2, 3, 4, 5, 6, 7) for values of p0 ranging from 0.01 to 0.10. For example, 
when p0 = 0.04, the power to detect an IOR of at least 4.0 equals 80.7% at the α = 0.05 level of 
statistical significance. In Figure 2, we see that 200 diseased (and 200 non-diseased) individuals are 
needed to be sufficiently powered (>80%) to detect an IOR of at least 2.0 at the α = 0.05 level of 
statistical significance when the proportion of non-diseased individuals who are exposed equals 0.10. 
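The two quoted results can be checked by hand from the Methods formulas. The arithmetic below is a sketch that assumes p1 = p0·IOR, a pooled exposure proportion of (p1 + r·p0)/(1 + r) written here as \bar{p}, r = 1, and rounded normal deviates (1.960 for the two-sided 0.05 level, 0.842 for 80% power); \Phi denotes the standard normal distribution function.

```latex
% Power example: n = 100 diseased, p_0 = 0.04, IOR = 4
p_1 = 0.04 \times 4 = 0.16, \qquad \bar{p} = \frac{0.16 + 0.04}{2} = 0.10
Z_{\beta} = \sqrt{\frac{100 \times (0.16 - 0.04)^2 \times 1}{2 \times 0.10 \times 0.90}} - 1.960
          = \sqrt{8} - 1.960 \approx 0.868, \qquad \Phi(0.868) \approx 0.807

% Sample size example: p_0 = 0.10, IOR = 2, power = 0.80
p_1 = 0.10 \times 2 = 0.20, \qquad \bar{p} = \frac{0.20 + 0.10}{2} = 0.15
n = \frac{(1.960 + 0.842)^2 \times 0.15 \times 0.85 \times 2}{(0.20 - 0.10)^2 \times 1}
  \approx \frac{7.85 \times 0.1275 \times 2}{0.01} \approx 200
```

Both values agree with the 80.7% power and the 200 diseased individuals quoted in the text.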

Figure 1. Power for IOR by proportion of non-diseased individuals who are exposed. 
(Alpha = 0.05; No. non-diseased Pts. = 100; non-diseased: diseased ratio = 1.0). 
4. Discussion and Conclusions 



An important property of adjusted IORs is their collapsibility and interpretability as an outcome 
measure of information gained after knowing exposure status (i.e., post-exposure odds divided by the 
pre-exposure odds). While TORs approximate RRs, and hence are approximately collapsible, under the 
rare disease assumption, they lack this property in retrospective studies of a common disease. 
IORs differ from RRs in that they may be used in case-referent studies. This is 
because IORs do not depend on the row (exposure) margins but rather on the column (disease) margins. 
However, because IORs, like RRs, are marginal estimates, both measures share the property 
of collapsibility. 

Based on the mirror relationship of IORs and RRs as marginal measures of association, the 
formulas used to compute power and sample size for RRs may be simply rearranged and applied to 
IORs. This is a particularly useful feature in practice given the availability of software for computing 
the power and sample size of RR estimates. 
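The mirror relationship can be illustrated with the Table 1 counts: applying an ordinary risk ratio calculation to the transposed 2 x 2 table returns the IOR. The sketch below is only one plausible reading of "transposing the input data matrix"; the helper names `risk_ratio` and `transpose` are hypothetical and do not correspond to any particular statistical package.

```python
def risk_ratio(table):
    """Risk ratio for a 2x2 table [[a, b], [c, d]].

    Rows are the two groups being compared; the first column counts the event.
    Returns (a / (a + b)) / (c / (c + d)).
    """
    (a, b), (c, d) = table
    return (a / (a + b)) / (c / (c + d))


def transpose(table):
    """Swap rows and columns of a 2x2 table."""
    return [list(column) for column in zip(*table)]


# Table 1: rows = exposed / non-exposed, columns = diseased / non-diseased.
table_1 = [[2352, 1600],
           [912, 1600]]

rr = risk_ratio(table_1)              # P(D|E) / P(D|not E)
ior = risk_ratio(transpose(table_1))  # P(E|D) / P(E|not D), i.e., the IOR

print(f"RR = {rr:.2f}, IOR = {ior:.2f}")
```

After transposition the rows are the disease groups, so the "risk" being compared is the probability of exposure given disease status, which is exactly the ratio (a/g)/(b/h) in Equation (2).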

The power and sample size formulas for IOR are based on asymptotic statistics and should be 
used only when the sample size is reasonably large and the sampling distribution for log(IOR) is 
approximately Gaussian. Similar to RRs, IORs are upwardly biased and actual power may be lower 
than the estimated one, at least for small sample sizes. Furthermore, the methods described for 
computing power and sample size apply to the simple case of unadjusted IORs and must be modified 
accordingly for more complex applications. 